RVD2: An ultra-sensitive variant detection model for low-depth heterogeneous next-generation sequencing data
Motivation: Next-generation sequencing technology is increasingly being used for clinical diagnostic tests. Unlike research cell lines, clinical samples are often genomically heterogeneous due to low sample purity or the presence of genetic subpopulations. Therefore, a variant calling algorithm for calling low-frequency polymorphisms in heterogeneous samples is needed. Results: We present a novel variant calling algorithm that uses a hierarchical Bayesian model to estimate allele frequency and call variants in heterogeneous samples. We show that our algorithm improves upon current classifiers and has higher sensitivity and specificity over a wide range of median read depth and minor allele frequency. We apply our model and identify twelve mutations in the PAXP1 gene in a matched clinical breast ductal carcinoma tumor sample, two of which are loss-of-heterozygosity events.
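The hierarchical Bayesian approach above can be illustrated, in heavily simplified form, with a Beta-Binomial posterior over the allele fraction at a single locus. This is a minimal sketch, not the RVD2 model itself (which shares strength across positions and compares against a control sample); the function name, flat prior, and 1% calling threshold are assumptions for illustration only.

```python
import random

def beta_binomial_posterior_call(alt_reads, depth, a0=1.0, b0=1.0,
                                 threshold=0.01, n_samples=20000, seed=0):
    """Toy variant call at one locus: Beta(a0, b0) prior on the minor
    allele fraction theta, Binomial likelihood for the read counts, and
    a Monte-Carlo estimate of P(theta > threshold | data)."""
    rng = random.Random(seed)
    # Conjugate update: posterior is Beta(a0 + alt, b0 + ref).
    a, b = a0 + alt_reads, b0 + depth - alt_reads
    hits = sum(rng.betavariate(a, b) > threshold for _ in range(n_samples))
    return hits / n_samples

# A locus with 8 alternate reads out of 400 (2% observed fraction)
# yields strong posterior evidence that theta exceeds 1%, while a
# single alternate read out of 400 does not.
p_variant = beta_binomial_posterior_call(8, 400)
p_null = beta_binomial_posterior_call(1, 400)
```

The conjugacy makes the posterior exact; the sampling step only approximates the tail probability used for the call.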
Liver Fibrosis Surface Assessment Based on Non-Linear Optical Microscopy
Ph.D. (Doctor of Philosophy)
Predicting the amount of coke deposition on catalyst through image analysis and soft computing
The amount of coke deposition on catalyst pellets is one of the most important indexes of catalytic performance and service life. As a result, it is essential to measure it and to analyze the active state of the catalysts during a continuous production process. This paper proposes a new method to predict the amount of coke deposition on catalyst pellets based on image analysis and soft computing. An image acquisition system consisting of a flatbed scanner and an opaque cover is used to obtain catalyst images. After image processing and feature extraction, twelve effective features are selected and the two best feature sets are determined by prediction tests. A neural network optimized by a particle swarm optimization algorithm is used to establish the prediction model of the coke amount on various datasets. The root mean square errors of the predictions are all below 0.021 and the coefficients of determination (R²) for the model are all above 78.71%. Therefore, a feasible, effective and precise method is demonstrated, which may be applied to realize real-time measurement of coke deposition based on on-line sampling and fast image analysis.
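The PSO-plus-regression pipeline described above can be sketched in miniature. As an assumption for illustration, a linear model stands in for the paper's neural network and the "image features" are synthetic; only the particle swarm optimization step follows the standard textbook form (inertia plus cognitive and social pulls).

```python
import numpy as np

def pso_fit_linear(X, y, n_particles=30, iters=200, seed=0):
    """Toy particle-swarm optimisation of linear-model weights, a
    stand-in for the paper's PSO-tuned neural network. Each particle
    is a candidate weight vector (weights + bias) scored by MSE."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] + 1
    pos = rng.normal(size=(n_particles, dim))
    vel = np.zeros_like(pos)

    def mse(w):
        return float(np.mean((X @ w[:-1] + w[-1] - y) ** 2))

    pbest = pos.copy()
    pbest_err = np.array([mse(w) for w in pos])
    gbest = pbest[pbest_err.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, 1))
        # Inertia + pull toward personal best + pull toward global best.
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        err = np.array([mse(w) for w in pos])
        improved = err < pbest_err
        pbest[improved], pbest_err[improved] = pos[improved], err[improved]
        gbest = pbest[pbest_err.argmin()].copy()
    return gbest, mse(gbest)

# Hypothetical "image features" with a known linear relation to the
# coke amount; PSO should drive the training MSE close to zero.
rng = np.random.default_rng(1)
X = rng.random((50, 3))
y = X @ np.array([0.5, -0.2, 0.8]) + 0.1
w, err = pso_fit_linear(X, y)
```

On this convex stand-in problem the swarm converges quickly; the paper's non-convex neural-network loss is where PSO's gradient-free search actually earns its keep.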
Experimental investigation of the isothermal section at 400 °C of the MgCeSr ternary system
The objective of this study is to determine the isothermal section at 400 °C of the MgCeSr system. The constitution of the CeSr system and the MgCeSr system has been investigated over the entire composition range using X-ray diffraction (XRD), field-emission scanning electron microscopy (SEM) and energy-dispersive spectroscopy (EDS). No new binary compound has been found in the CeSr system, and no ternary compound has been found in the MgCeSr system either. Nine three-phase regions have been experimentally observed. Six binary phases, Mg2Sr, Mg23Sr6, Mg38Sr9, Mg17Sr2, Mg12Ce and Mg41Ce5, are found to dissolve about 3–7 at.% of the third element. This study provides the first experimental data on the CeSr binary system and determines the isothermal section at 400 °C of the MgCeSr ternary system.
Highly-Accurate Electricity Load Estimation via Knowledge Aggregation
Mid-term and long-term electric energy demand prediction is essential for the
planning and operation of the smart grid, especially in countries where the
power system operates in a deregulated environment. Traditional forecasting
models fail to incorporate external knowledge, while modern data-driven models
ignore interpretability; moreover, the load series is influenced by many
complex factors, making highly unstable and nonlinear power load series
difficult to handle. To address this forecasting problem, we propose a more
accurate district-level load prediction model based on domain knowledge and
the idea of decomposition and ensemble. Its main idea is three-fold: 1) given
the non-stationary character of the load time series, with obvious cyclicality
and periodicity, we decompose it into series with actual economic meaning and
then carry out load analysis and forecasting; 2) Kernel Principal Component
Analysis (KPCA) is applied to extract the principal components of the weather
and calendar-rule feature sets, reducing the data dimensionality; 3) drawing
on the strengths of various models informed by domain knowledge, we propose a
hybrid model (XASXG) based on the Autoregressive Integrated Moving Average
model (ARIMA), support vector regression (SVR) and the extreme gradient
boosting model (XGBoost). With these designs, the model accurately forecasts
electricity demand despite its highly unstable characteristics. We compared
our method with nine benchmark methods, including classical statistical models
as well as state-of-the-art machine-learning models, on real time series of
monthly electricity demand in four Chinese cities. The empirical study shows
that the proposed hybrid model is superior to all competitors in terms of
accuracy and prediction bias.
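Of the components above, the KPCA step is the most self-contained. The sketch below is a minimal RBF-kernel PCA on synthetic data (the kernel width `gamma` and the data are assumptions), showing the feature-space centring and eigendecomposition that reduce a feature set to a few principal components.

```python
import numpy as np

def rbf_kpca(X, n_components=2, gamma=0.5):
    """Minimal kernel PCA with an RBF kernel: build the Gram matrix,
    centre it in feature space, and project onto the leading
    eigenvectors scaled by the square roots of their eigenvalues."""
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # centre in feature space
    vals, vecs = np.linalg.eigh(Kc)              # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]  # keep the largest
    vals, vecs = vals[idx], vecs[:, idx]
    # Rows are the projected coordinates of the training points.
    return vecs * np.sqrt(np.maximum(vals, 1e-12))

# Hypothetical stand-in for a weather/calendar feature matrix:
# 40 observations with 6 features, compressed to 3 components.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
Z = rbf_kpca(X, n_components=3)
```

Unlike linear PCA, the components live in the kernel-induced feature space, so nonlinear structure in the weather and calendar features can be captured with just a few dimensions.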
Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation
Accurate segmentation of topological tubular structures, such as blood
vessels and roads, is crucial in various fields, ensuring accuracy and
efficiency in downstream tasks. However, many factors complicate the task,
including thin local structures and variable global morphologies. In this work,
we note the specificity of tubular structures and use this knowledge to guide
our DSCNet to simultaneously enhance perception in three stages: feature
extraction, feature fusion, and loss constraint. First, we propose a dynamic
snake convolution to accurately capture the features of tubular structures by
adaptively focusing on slender and tortuous local structures. Subsequently, we
propose a multi-view feature fusion strategy to complement the attention to
features from multiple perspectives during feature fusion, ensuring the
retention of important information from different global morphologies. Finally,
a continuity constraint loss function, based on persistent homology, is
proposed to constrain the topological continuity of the segmentation better.
Experiments on 2D and 3D datasets show that our DSCNet provides better accuracy
and continuity on the tubular structure segmentation task compared with several
methods. Our code will be publicly available. Comment: Accepted by ICCV 2023.
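The persistent-homology continuity constraint can be hinted at in dimension 0, where the homology of a binary segmentation reduces to counting connected components. The toy counter below (not the paper's differentiable loss; the 4-connectivity choice and example masks are assumptions) shows the kind of quantity such a loss penalises: a vessel prediction broken into pieces has more components than a continuous one.

```python
import numpy as np

def connected_components(mask):
    """Count 4-connected foreground components of a 2D binary mask via
    iterative flood fill. Dimension-0 persistent homology of a binary
    segmentation reduces to this component count."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    h, w = mask.shape
    count = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                count += 1                     # new component found
                stack = [(i, j)]
                seen[i, j] = True
                while stack:
                    y, x = stack.pop()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            stack.append((ny, nx))
    return count

# A prediction broken by a gap has two components; the continuous
# ground truth has one, so a continuity term would penalise the gap.
broken = np.array([[1, 1, 0, 0, 1, 1]])
whole = np.array([[1, 1, 1, 1, 1, 1]])
gap_penalty = connected_components(broken) - connected_components(whole)
```

The actual loss in the paper works with persistence diagrams so the penalty is differentiable; this sketch only shows the topological quantity being constrained.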